Conversation

mouhc1ne
Contributor

  • Implement url_encode function

Contributor

github-actions bot commented Aug 25, 2025

🔍 Preview links for changed docs

@mouhc1ne
Contributor Author

mouhc1ne commented Aug 25, 2025

How the new function looks in the docs.
image

@mouhc1ne mouhc1ne force-pushed the url_encode_function branch from da1095a to 1c7be9e Compare August 25, 2025 20:12
@ConvertEvaluator()
static BytesRef process(final BytesRef val) {
    String s = val.utf8ToString();
    String encoded = URLEncoder.encode(s, StandardCharsets.UTF_8);
Contributor Author

Two concerns I have about this snippet:

  1. This actually over-encodes the input. For example http://elastic.co becomes http%3A%2F%2Felastic.co. I wonder if we have a better option internally.
  2. Performance when passing strings around, as I see all other evaluators directly manipulating byte arrays.
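
To make the first concern concrete, here is a minimal standalone sketch. The class name `UrlEncodeDemo` is made up for illustration and is not part of this PR. It calls `java.net.URLEncoder` the same way the snippet does. `URLEncoder` targets `application/x-www-form-urlencoded` form data, so it percent-encodes `:` and `/` and turns spaces into `+`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the PR.
public class UrlEncodeDemo {
    // Same call as in the reviewed snippet, minus the BytesRef plumbing.
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // ':' and '/' are percent-encoded, '.' is left alone.
        System.out.println(encode("http://elastic.co")); // http%3A%2F%2Felastic.co
        // Form encoding maps a space to '+', not "%20".
        System.out.println(encode("a b"));               // a+b
    }
}
```

Whether this counts as over-encoding depends on what callers expect: encoding a whole URL versus encoding a single component such as a query parameter value.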

Member

> 2. I see all other evaluators directly manipulating byte arrays.

Many don't. We'd prefer if they did, but sometimes that's life.

I think the over-encoding is a bigger deal - we should figure out what the folks who asked for this expect from it. If you add this as a SNAPSHOT function then we have a lot of latitude to change it later.

Contributor

> http://elastic.co becomes http%3A%2F%2Felastic.co

From what I read in the RFC, those are reserved characters and "should" be encoded. So I would say that it's fine (?)
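
For comparison, a byte-oriented sketch of percent-encoding that leaves only the RFC 3986 unreserved characters (letters, digits, `-`, `.`, `_`, `~`) untouched. This assumes "the RFC" above means RFC 3986. The class `PercentEncoderSketch` and its methods are hypothetical and are not the PR's implementation. It also works on the UTF-8 bytes directly, which touches on the second concern about passing strings around.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the code merged in this PR.
public class PercentEncoderSketch {
    private static final byte[] HEX = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);

    // RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
    static boolean isUnreserved(int b) {
        return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') || (b >= '0' && b <= '9')
            || b == '-' || b == '.' || b == '_' || b == '~';
    }

    // Encodes UTF-8 bytes without building an intermediate String.
    static byte[] encode(byte[] utf8) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(utf8.length);
        for (byte raw : utf8) {
            int b = raw & 0xFF; // treat the byte as unsigned
            if (isUnreserved(b)) {
                out.write(b);
            } else {
                out.write('%');
                out.write(HEX[b >>> 4]);
                out.write(HEX[b & 0x0F]);
            }
        }
        return out.toByteArray();
    }

    // Convenience wrapper for trying the sketch with plain strings.
    static String encode(String s) {
        byte[] encoded = encode(s.getBytes(StandardCharsets.UTF_8));
        return new String(encoded, StandardCharsets.US_ASCII);
    }
}
```

With this sketch `http://elastic.co` still becomes `http%3A%2F%2Felastic.co`, since `:` and `/` are reserved. The visible differences from `URLEncoder` are that a space becomes `%20` instead of `+`, `~` stays as-is, and `*` is encoded.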

@ivancea ivancea self-requested a review August 26, 2025 09:30
Contributor

@ivancea ivancea left a comment

LGTM!

@mouhc1ne mouhc1ne added the ES|QL-ui (Impacts ES|QL UI) and test-release (Trigger CI checks against release build) labels Aug 26, 2025
@mouhc1ne mouhc1ne force-pushed the url_encode_function branch 2 times, most recently from 4282895 to a3f01fb Compare August 26, 2025 21:17
@mouhc1ne mouhc1ne requested review from ivancea and nik9000 August 26, 2025 22:58
@mouhc1ne mouhc1ne force-pushed the url_encode_function branch from a3f01fb to b24117f Compare August 27, 2025 11:55
@mouhc1ne mouhc1ne requested a review from ivancea August 27, 2025 11:58
@mouhc1ne mouhc1ne force-pushed the url_encode_function branch 4 times, most recently from c6fb600 to 25545d4 Compare August 27, 2025 19:32
@mouhc1ne mouhc1ne marked this pull request as ready for review August 27, 2025 19:43
@elasticsearchmachine elasticsearchmachine added the Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)) label Aug 27, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Collaborator

Pinging @elastic/kibana-esql (ES|QL-ui)

Member

@nik9000 nik9000 left a comment

Looks good to me too.

@mouhc1ne mouhc1ne force-pushed the url_encode_function branch from 25545d4 to 2d3cdcc Compare August 28, 2025 13:04
@mouhc1ne
Contributor Author

The failing release-test (https://buildkite.com/elastic/elasticsearch-pull-request/builds/88528/steps/canvas?jid=0198f0c9-4928-4cf3-9939-0a867d857f5f) is unrelated to these changes. Removing test-release label.

@mouhc1ne mouhc1ne removed the test-release (Trigger CI checks against release build) label Aug 28, 2025
- Rename the capability to exclude the decoding portion, since this PR is only concerned with encoding.
- Mark url_encode as SNAPSHOT to allow changes in encoding logic.
- Add tests that use random strings to cover a wide range of characters; these run alongside the tests that use random URLs.
- Minor change in the sample csv test string to showcase a fictional URL with a path and a few query params.
- Remove url_encode from docs since it's a snapshot function
- Declare it alongside other snapshot functions in EsqlFunctionRegistry
- Annotate UrlEncode similarly to other snapshot functions
- Add url_encode capability to csv tests
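
As a rough illustration of the random-string testing idea in the list above, here is a sketch of a round-trip check over random Unicode strings. It is an assumption about the general approach, not the test code in this PR: the class `RandomStringRoundTrip` and its methods are hypothetical, and it uses the plain JDK `URLEncoder`/`URLDecoder` rather than the ES|QL test framework.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Random;

// Hypothetical sketch, not the tests added in this PR.
public class RandomStringRoundTrip {
    // Builds a string of random code points, skipping surrogates,
    // which are not valid on their own and would not survive UTF-8 encoding.
    static String randomString(Random rnd, int codePoints) {
        StringBuilder sb = new StringBuilder();
        int added = 0;
        while (added < codePoints) {
            int cp = rnd.nextInt(Character.MAX_CODE_POINT + 1);
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                continue;
            }
            sb.appendCodePoint(cp);
            added++;
        }
        return sb.toString();
    }

    // Encoding followed by decoding should give back the original string.
    static boolean roundTrips(String s) {
        String encoded = URLEncoder.encode(s, StandardCharsets.UTF_8);
        String decoded = URLDecoder.decode(encoded, StandardCharsets.UTF_8);
        return decoded.equals(s);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed so failures are reproducible
        for (int i = 0; i < 1000; i++) {
            String s = randomString(rnd, 20);
            if (!roundTrips(s)) {
                throw new AssertionError("round trip failed for: " + s);
            }
        }
    }
}
```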
@mouhc1ne mouhc1ne force-pushed the url_encode_function branch from 2d3cdcc to 7572fc5 Compare August 28, 2025 14:52
@mouhc1ne mouhc1ne merged commit 875e6f3 into elastic:main Aug 28, 2025
33 checks passed
mouhc1ne added a commit to mouhc1ne/elasticsearch that referenced this pull request Aug 28, 2025
- Add url_decode as a snapshot function.
- Move common logic to both encode/decode to a parent class
mouhc1ne added a commit to mouhc1ne/elasticsearch that referenced this pull request Aug 29, 2025
- Add url_decode as a snapshot function.
- Move common logic to both encode/decode to a parent class
mouhc1ne added a commit to mouhc1ne/elasticsearch that referenced this pull request Aug 29, 2025
JeremyDahlgren pushed a commit to JeremyDahlgren/elasticsearch that referenced this pull request Aug 29, 2025
* ESQL: Add url_encode function
mouhc1ne added a commit to mouhc1ne/elasticsearch that referenced this pull request Aug 29, 2025
mouhc1ne added a commit to mouhc1ne/elasticsearch that referenced this pull request Aug 29, 2025
@mouhc1ne mouhc1ne self-assigned this Sep 3, 2025
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 2, 2025
BASE=37f65b0bf8eb0f75ea696ad136eac0bd50005330
HEAD=7572fc5993ae95ef569fd2283666951cb387ff78
Branch=main
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 6, 2025
BASE=37f65b0bf8eb0f75ea696ad136eac0bd50005330
HEAD=7572fc5993ae95ef569fd2283666951cb387ff78
Branch=main
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 8, 2025
BASE=37f65b0bf8eb0f75ea696ad136eac0bd50005330
HEAD=7572fc5993ae95ef569fd2283666951cb387ff78
Branch=main
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Oct 17, 2025
BASE=37f65b0bf8eb0f75ea696ad136eac0bd50005330
HEAD=7572fc5993ae95ef569fd2283666951cb387ff78
Branch=main
Labels

- :Analytics/ES|QL (AKA ESQL)
- >enhancement
- ES|QL-ui (Impacts ES|QL UI)
- Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo))
- v9.2.0

4 participants